Abstract. A key problem in quantum computing is finding a viable techno-
logical path toward the creation of a scalable quantum computer. One possible
approach toward solving part of this problem is distributed computing, which 
provides an effective way of utilizing a network of limited capacity quantum 
computers. 

In this paper, we present two primitive operations, cat-entangler and cat- 
disentangler, which in turn can be used to implement non-local operations, e.g. 
non-local CNOT and quantum teleportation. We also show how to establish an 
entangled pair, and use entangled pairs to efficiently create a generalized GHZ 
state. Furthermore, we present procedures which allow us to reuse channel 
qubits in a sequence of non-local operations. 

These non-local operations work on the principle that a cat-like state, 
created by cat-entangler, can be used to distribute a control qubit among mul- 
tiple computers. Using this principle, we show how to efficiently implement 
non-local control operations in many situations, including a parallel implemen-
tation of a certain kind of unitary transformation. Finally, as an example, we
present a distributed version of the quantum Fourier transform. 



1. Introduction 

Distributed computing provides an effective means of utilizing a network of 
limited capacity quantum computers. By connecting a network of limited capacity 
quantum computers via classical and quantum channels, a group of small quantum 
computers can simulate a quantum computer with a large number of qubits. This 
approach is useful for the development of quantum computers because the earliest 
useful quantum computers will most likely hold only a small number of qubits. This 
constraint limits the usage of such quantum computers to small problems. 

We propose that distributed quantum computing (DQC) is a possible solu- 
tion to this problem. Furthermore, even if one could construct a large quantum 
computer, the distributed computing model can still provide an effective means of 
increasing computational power. 

By a distributed quantum computer, we mean a network of small quantum 
computers, connected by classical and quantum channels. Each quantum computer 
(or node) has a register that can hold only a limited number of qubits. Each node 
also possesses a small number of channel qubits which can be sent back and forth
over the network. A qubit in a register can freely interact with any other qubit in 
the same register. It also can freely interact with channel qubits that are in the 
same node. To interact with qubits on a remote computer, the qubits have to be 
teleported or physically transported to the remote computer, or have to interact 
via non-local operations. 

Indeed, distributed quantum computing can simply be implemented by tele- 
porting or physically transporting qubits back and forth. A more efficient imple- 
mentation of DQC has been proposed by Eisert et al [T] using a non-local CNOT 
gate. Since the controlled-NOT gate (CNOT) together with all one-qubit gates is
universal [2], a distributed implementation of any unitary transformation reduces 
to the implementation of non-local CNOT gates. Eisert et al also prove that one 
shared entangled pair (ebit) and two classical bits (cbits) are necessary and suffi- 
cient to implement a non-local CNOT gate. 

In this paper, we present two primitive operations, cat-entangler and cat- 
disentangler, which in turn can be used to implement non-local operations, e.g. 
a non-local CNOT and quantum teleportation protocol. The cat-entangler and 
cat-disentangler can be implemented using only local operations and classical
communication (LOCC), assuming that entanglement has already been established.
We show how to establish an entangled pair between two nodes, and use entangled
pairs to efficiently create a generalized GHZ state. Furthermore, we present
procedures which allow us to reuse channel qubits in a sequence of non-local operations.

To implement a non-local CNOT gate, first an entangled pair must be established
between two computers. Then the cat-entangler transforms a control qubit
$\alpha|0\rangle + \beta|1\rangle$ and an entangled pair $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ into the state $\alpha|00\rangle + \beta|11\rangle$, called
a "cat-like" state. This state allows two computers to share the control qubit. As
a result, each computer can now use a qubit shared within the cat-like state as a
local control qubit. After completion of the control operation, the cat-disentangler is
applied to disentangle and restore the control qubit from the cat-like state.
Finally, the channel qubits are reset so that the entangled pair can be
re-established.

To teleport an unknown qubit to a target qubit, we begin by establishing an 
entangled pair between two computers. Then we apply the cat-entangler operation 
to create a cat-like state from an unknown qubit and the entangled pair. After 
that, we apply a slightly modified cat-disentangler operation to disentangle and
restore the unknown qubit from the cat-like state into the target qubit. Finally, we
reset the channel qubits. In other words, quantum teleportation can be considered 
as a composition of the cat-entangler operation followed by the cat-disentangler 
operation. 

The cat-entangler and cat-disentangler operations can be extended to a multi-party
environment by replacing the entangled pair with a generalized Greenberger-
Horne-Zeilinger (GHZ) state (also called a cat state) expressed as $\frac{1}{\sqrt{2}}(|00\ldots0\rangle + |11\ldots1\rangle)$.
The state $\alpha|00\ldots0\rangle + \beta|11\ldots1\rangle$ (also called a cat-like state) can be
created using only LOCC. A cat-like state can be used to share a control qubit 
between multiple computers, allowing each computer to use a qubit shared within 
the cat-like state as a local control line. In many cases, this idea leads to an efficient 
implementation of multi-party control gates. In addition, a parallel implementation 
of some unitary transformations can also be realized. 

Before performing any non-local operation, entanglement between computers
must be established. Moreover, the entanglement has to be refreshed after its
use. Brennen, Song, and Williams address this issue in a lattice model quantum
computer using entanglement swapping [7], which can be used to establish and
refresh entanglement between two qubits. Multiple entanglement swapping [8]
can create a generalized GHZ state, which is used by multiple computers.

We address these same issues for the quantum network model by showing how to 
establish two entangled pairs by sending two qubits, one in each direction. Asymp- 
totically, this is equivalent to establishing one entangled pair at the cost of sending 
one qubit. Furthermore, we show how to convert a number of entangled pairs into 
a generalized GHZ state, which in turn is used to distribute control over multiple 
computers. 

We also address refreshing entanglement by observing that measurements, 
made during the primitive operations, provide crucial information for resetting 
channel qubits to $|0\rangle$. Hence, channel qubits can be re-entangled with other
channel qubits at a later time.

The idea of using a cat-like state to distribute control qubits is discussed in
section 2. Next, the cat-entangler and cat-disentangler are presented. After that,
we use these operations to construct non-local CNOT and teleportation operations.
Section 3 discusses various constructions of non-local control gates in different
situations. Issues related to establishing and refreshing entanglement are addressed
in section 4 via constructing entangling gates. Finally, an example of a distributed
version of the quantum Fourier transform is presented in section 5.



Notation 

We adopt the notation found in [5]. For any one-qubit unitary matrix
$U = \begin{pmatrix} u_{00} & u_{01} \\ u_{10} & u_{11} \end{pmatrix}$ and integer $m \in \{0, 1, 2, \ldots\}$, the operator $\Lambda_m(U)$, which acts on
$m + 1$ qubits, is defined as

$$\Lambda_m(U)|x_1, \ldots, x_m, y\rangle = \begin{cases} u_{y0}|x_1, \ldots, x_m, 0\rangle + u_{y1}|x_1, \ldots, x_m, 1\rangle & \text{if } \bigwedge_{k=1}^{m} x_k = 1 \\ |x_1, \ldots, x_m, y\rangle & \text{if } \bigwedge_{k=1}^{m} x_k = 0 \end{cases}$$

for all $x_1, \ldots, x_m, y \in \{0, 1\}$, where "$\wedge$" denotes logical 'and.'

In other words, $\Lambda_m(U)$ is an m-fold controlled-$U$ gate. For example, if
$U = X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, then $\Lambda_1(U)$ is the CNOT gate, and $\Lambda_2(U)$ is the Toffoli gate. A more
detailed discussion of this notation can be found in [5].



Figure 1. This diagram represents an entangling gate, denoted
by E. This gate can be locally implemented using a Hadamard
gate and a CNOT gate, as shown. A distributed version of the
entangling gate is discussed in section 4.

Figure 2. This diagram represents a generalized entangling gate
for 3 qubits, denoted by $E_3$. It creates the GHZ state if the input
is the state $|000\rangle$. This gate can be locally implemented by a
Hadamard gate and two CNOT gates, as shown.



Figure 3. A swap gate, which maps the input $|a\rangle|b\rangle$ to the output $|b\rangle|a\rangle$. The
distributed implementation of this gate is discussed in section 4.



To reduce the complexity of diagrams, we introduce a diagram for an entangling
gate, called an E gate, as shown in figure 1. Two qubits are entangled by this E
gate. In particular, if the input state is $|00\rangle$, the output state will be $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$.

The entangling gate E can be generalized to an m-fold entangling gate, denoted
by $E_m$, which creates the cat state $\frac{1}{\sqrt{2}}(|00\ldots0\rangle + |11\ldots1\rangle)$ upon the input state
$|00\ldots0\rangle$. The $E_3$ gate is illustrated in figure 2.

Finally, figure 3 shows a diagram for the swap gate. Section 4 discusses the
distributed implementations of the $E_m$ gate and the swap gate.

2. Distributing Control via a Cat-like State 

In this section, we discuss how a cat-like state can distribute control over mul- 
tiple computers. This is the key idea of the construction of non-local interaction 
presented in this paper. We will demonstrate this idea using the simplest control 
gate, i.e., the CNOT gate. 

The CNOT gate $\Lambda_1(X)$ is defined on a two-qubit system as follows: For any
control qubit $\alpha|0\rangle + \beta|1\rangle$ and target bit $|t\rangle$,

(2.1) $\Lambda_1(X)((\alpha|0\rangle + \beta|1\rangle)|t\rangle) = \alpha|0\rangle|t\rangle + \beta|1\rangle X(|t\rangle)$.

We assume that a cat state $\frac{1}{\sqrt{2}}(|0\ldots0\rangle + |1\ldots1\rangle)$ has already been shared between
multiple computers. Then the cat-entangler can be used to transform a control qubit
and a cat state into a cat-like state. As a result, equation (2.1) becomes

(2.2) $\alpha|00\ldots0\rangle|t\rangle + \beta|11\ldots1\rangle X(|t\rangle)$,

which can be rewritten as

(2.3) $\Lambda_1(X)((\alpha|00\ldots0\rangle + \beta|11\ldots1\rangle)|t\rangle)$.

Equations (2.2) and (2.3) show that we can use any qubit, shared in the cat-like
state, as a control line. For example, the two circuits in figure 4 have an equivalent
effect on the target bit, assuming that the state of lines one and two is the cat-like
state $\alpha|00\rangle + \beta|11\rangle$.

Figure 4. Since line one and line two are in the state $|c\rangle = \alpha|00\rangle + \beta|11\rangle$,
the target qubit $|t\rangle$ can be controlled by either line
one or two.




Figure 5. Given a qubit $|c\rangle = \alpha|0\rangle + \beta|1\rangle$ and a generalized GHZ
state $\frac{1}{\sqrt{2}}(|0000\rangle + |1111\rangle)$, created by an entangling gate $E_4$, one
can create a cat-like state $\alpha|0000\rangle + \beta|1111\rangle$ with the above circuit.



A cat-entangler is shown in figure 5. The box labeled M is a standard basis
measurement. Since the result of the measurement (represented by a dotted line) is
a classical bit, we can distribute and reuse the result to control many X-gates at
the same time.

Theorem 2.1. Given a qubit $|c\rangle = \alpha|0\rangle + \beta|1\rangle$ and an m-fold cat state
$\frac{1}{\sqrt{2}}(|00\ldots0_m\rangle + |11\ldots1_m\rangle)$, a cat-like state $|\psi_c\rangle = \alpha|00\ldots0_m\rangle + \beta|11\ldots1_m\rangle$ can be created by a
CNOT gate, local operations (i.e., one measurement and X-gates), and classical
communication.

Proof: A quantum circuit that creates an m-fold cat-like state can be generalized
from the circuit shown in figure 5. Assume that an m-fold GHZ state is created
by an $E_m$ gate.

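At point A, after applying the CNOT gate, the state of the circuit is

(2.4) $\frac{1}{\sqrt{2}}(\alpha|0\underline{0}0\ldots0_m\rangle + \alpha|0\underline{1}1\ldots1_m\rangle + \beta|1\underline{1}0\ldots0_m\rangle + \beta|1\underline{0}1\ldots1_m\rangle)$.

After the measurement, the state is either

(2.5a) $\alpha|0\underline{0}0\ldots0_m\rangle + \beta|1\underline{0}1\ldots1_m\rangle$

or

(2.5b) $\alpha|0\underline{1}1\ldots1_m\rangle + \beta|1\underline{1}0\ldots0_m\rangle$,

where the underlined qubit is the measured qubit.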

Assume that the result of measurement is a classical bit r. After applying X
controlled by the classical bit r (represented by a dotted line on the X gates), the
state at point B is

(2.6) $\alpha|0\underline{r}0\ldots0_m\rangle + \beta|1\underline{r}1\ldots1_m\rangle$.

Hence, an m-fold cat-like state shared between the qubit $|c\rangle$ and other qubits from
the cat state (except the measured qubit) is created.

Figure 6. An input state at point A, $\alpha|0000\rangle + \beta|1111\rangle$, is transformed
by a cat-disentangler into a state $(\alpha|0\rangle + \beta|1\rangle)|r_2 r_3 r_4\rangle$,
where $r_k$ is the result of the measurement on line k. The phase-flip
gate (Z-gate) is controlled by the mod 2 sum of the $r_k$, written $\oplus_k r_k$.
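
As an informal numerical check of Theorem 2.1, the following Python sketch tracks the state through equations (2.4) to (2.6) for both measurement outcomes. It is only an illustration under stated assumptions: the dictionary-based state representation and the helper names (`cnot`, `flip`, `project`, `cat_entangler`) are introduced here and are not part of the construction above, and the measurement outcome r is post-selected rather than sampled. Here m counts the qubits of the cat state, so the resulting cat-like state spans the control qubit and the m - 1 unmeasured cat qubits.

```python
from math import isclose, sqrt

def cnot(state, c, t):
    """Apply CNOT (control c, target t) to a {bits: amplitude} state."""
    return {b[:t] + (b[t] ^ b[c],) + b[t + 1:]: a for b, a in state.items()}

def flip(state, q):
    """Apply an X gate to qubit q."""
    return {b[:q] + (1 - b[q],) + b[q + 1:]: a for b, a in state.items()}

def project(state, q, r):
    """Post-select outcome r on qubit q and renormalize the state."""
    kept = {b: a for b, a in state.items() if b[q] == r}
    norm = sqrt(sum(abs(a) ** 2 for a in kept.values()))
    return {b: a / norm for b, a in kept.items()}

def cat_entangler(alpha, beta, m, r):
    """Return the state at point B for measurement outcome r.

    Qubit 0 is the control qubit, qubits 1..m hold the m-fold cat state,
    and qubit 1 is the cat qubit that gets measured.
    """
    state = {}
    for c, amp in ((0, alpha), (1, beta)):
        for cat in ((0,) * m, (1,) * m):
            state[(c,) + cat] = amp / sqrt(2)
    state = cnot(state, 0, 1)        # point A, equation (2.4)
    state = project(state, 1, r)     # equation (2.5a) or (2.5b)
    if r == 1:                       # X gates controlled by the classical bit r
        for q in range(2, m + 1):
            state = flip(state, q)
    return state                     # equation (2.6)

for r in (0, 1):
    print(r, cat_entangler(0.6, 0.8, m=4, r=r))
```

For either outcome only two basis states survive, with amplitudes $\alpha$ and $\beta$, and the measured qubit holds r.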

To complete the non-local CNOT operation, the control line must be disentangled
and restored. This can be accomplished by a cat-disentangler, as demonstrated
in figure 6.

Theorem 2.2. A state $|\psi_c\rangle = \alpha|00\ldots0_m\rangle + \beta|11\ldots1_m\rangle$ can be transformed
into a state $(\alpha|0\rangle + \beta|1\rangle)|r\rangle$, where $r \in \{0, 1, \ldots, 2^{m-1} - 1\}$ is the result of $m - 1$
one-qubit measurements, by a cat-disentangler, which can be generalized from the circuit
shown in figure 6.

Proof: Assume the input of the circuit is $|\psi_c\rangle$. After applying Hadamard
transformations, the state of the circuit is

(2.7) $\frac{1}{\sqrt{2^{m-1}}}\left(\alpha|0\rangle \sum_{r=0}^{2^{m-1}-1} |r\rangle + \beta|1\rangle \sum_{r=0}^{2^{m-1}-1} (-1)^{\oplus_k r_k} |r\rangle\right)$,

where the binary representation of r is $r_1 r_2 \cdots r_{m-1}$.
After the measurements, the state becomes

(2.8) $(\alpha|0\rangle + (-1)^{\oplus_k r_k}\beta|1\rangle)|r\rangle$,

where $r_i$ is the result of the measurement on line $i + 1$.

To correct the phase, we use the result of the computation $\oplus_k r_k$ to control
the Z gate applied to the first line. Hence, the state at the end of the circuit is
$(\alpha|0\rangle + \beta|1\rangle)|r\rangle$.
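
The proof of Theorem 2.2 can be checked in the same informal way. The sketch below is an assumption-laden illustration, not the circuit of figure 6 itself: the helper names (`hadamard`, `project`, `phase_flip`, `cat_disentangler`) and the dictionary state representation are introduced here, and each measurement outcome is post-selected so that every branch of equation (2.8) can be inspected.

```python
from itertools import product
from math import isclose, sqrt

def hadamard(state, q):
    """Apply a Hadamard gate to qubit q of a {bits: amplitude} state."""
    out = {}
    for bits, amp in state.items():
        for v in (0, 1):
            new = bits[:q] + (v,) + bits[q + 1:]
            sign = -1 if bits[q] == 1 and v == 1 else 1
            out[new] = out.get(new, 0) + sign * amp / sqrt(2)
    return out

def project(state, q, r):
    """Post-select outcome r on qubit q and renormalize the state."""
    kept = {b: a for b, a in state.items() if b[q] == r}
    norm = sqrt(sum(abs(a) ** 2 for a in kept.values()))
    return {b: a / norm for b, a in kept.items()}

def phase_flip(state, q):
    """Apply a Z gate to qubit q."""
    return {b: (-a if b[q] else a) for b, a in state.items()}

def cat_disentangler(alpha, beta, results):
    """Disentangle an m-qubit cat-like state; results holds r_1 .. r_(m-1)."""
    m = len(results) + 1
    state = {(0,) * m: alpha, (1,) * m: beta}
    for q in range(1, m):                      # Hadamards, equation (2.7)
        state = hadamard(state, q)
    for q, r in enumerate(results, start=1):   # measurements, equation (2.8)
        state = project(state, q, r)
    if sum(results) % 2 == 1:                  # Z controlled by the mod 2 sum
        state = phase_flip(state, 0)
    return state

for results in product((0, 1), repeat=3):
    print(results, cat_disentangler(0.6, 0.8, results))
```

For every combination of outcomes, the first qubit ends in $\alpha|0\rangle + \beta|1\rangle$ and the measured qubits hold the classical results.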

In the circuit in figure 6, the first line can be switched with another
line. Therefore, the control state can be restored to any qubit involved in the
cat-like state. Furthermore, we can leave more qubits untouched, which means
we do not apply Hadamard and measurement on one or more qubits. As a result,
the remaining qubits form a smaller cat-like state.



GENERALIZED GHZ STATES AND DISTRIBUTED QUANTUM COMPUTING 






2.1. Constructing a Teleportation Circuit. A teleportation circuit can
be considered as a composition of the cat-entangler and cat-disentangler as shown
in figure 7.

Figure 7. This figure shows a construction of a teleportation
circuit using the cat-entangler and cat-disentangler. An unknown
quantum state $|a\rangle$ is teleported from the first line to the third line.

After the cat-entangler is applied, qubits 1 and 3 are in the cat-like
state $\alpha|00\rangle + \beta|11\rangle$. We next apply the cat-disentangler, modified so that the unknown
qubit $|a\rangle$ is restored on line 3, not on line 1.

On closer inspection of the teleportation circuit, the group of operations
applied to the first two qubits after the entangling gate is actually the Bell basis
transformation followed by a standard basis measurement. These operations are
equivalent to a complete Bell measurement. Furthermore, the result of the measurement
is used to control the X and Z gates, as in the teleportation circuit described
in [2].
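
The composition described in this subsection can be sketched numerically. The code below is a minimal illustration under assumptions made here: qubits 0, 1, 2 stand for lines 1, 2, 3 of figure 7, the helper names are hypothetical, the entangling gate is taken to be a Hadamard followed by a CNOT as in figure 1, and measurement outcomes r and s are post-selected.

```python
from itertools import product
from math import isclose, sqrt

def cnot(state, c, t):
    """Apply CNOT (control c, target t) to a {bits: amplitude} state."""
    return {b[:t] + (b[t] ^ b[c],) + b[t + 1:]: a for b, a in state.items()}

def flip(state, q):
    """Apply an X gate to qubit q."""
    return {b[:q] + (1 - b[q],) + b[q + 1:]: a for b, a in state.items()}

def hadamard(state, q):
    """Apply a Hadamard gate to qubit q."""
    out = {}
    for bits, amp in state.items():
        for v in (0, 1):
            new = bits[:q] + (v,) + bits[q + 1:]
            sign = -1 if bits[q] == 1 and v == 1 else 1
            out[new] = out.get(new, 0) + sign * amp / sqrt(2)
    return out

def project(state, q, r):
    """Post-select outcome r on qubit q and renormalize the state."""
    kept = {b: a for b, a in state.items() if b[q] == r}
    norm = sqrt(sum(abs(a) ** 2 for a in kept.values()))
    return {b: a / norm for b, a in kept.items()}

def phase_flip(state, q):
    """Apply a Z gate to qubit q."""
    return {b: (-a if b[q] else a) for b, a in state.items()}

def teleport(alpha, beta, r, s):
    """Teleport alpha|0> + beta|1> from qubit 0 to qubit 2."""
    state = {(0, 0, 0): alpha, (1, 0, 0): beta}
    state = cnot(hadamard(state, 1), 1, 2)      # E gate on the channel pair
    state = project(cnot(state, 0, 1), 1, r)    # cat-entangler, outcome r
    if r:
        state = flip(state, 2)
    state = project(hadamard(state, 0), 0, s)   # modified cat-disentangler
    if s:
        state = phase_flip(state, 2)            # phase correction on line 3
    return state

for r, s in product((0, 1), repeat=2):
    print(r, s, teleport(0.6, 0.8, r, s))
```

In all four branches the third qubit carries the amplitudes $\alpha$ and $\beta$, while the first two qubits hold the classical outcomes s and r.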

3. Constructing Non-local Control Gates 

In this section, we begin by discussing the construction of a non-local CNOT 
gate. Then we show how to construct efficient distributed control gates in different 
situations. We assume that we have distributed entangling gates, which create and 
share a cat state among multiple computers. Construction of distributed entangling 
gates is described in section 4.

3.1. Constructing a Non-local CNOT Gate. Because a control
line can be distributed by a cat-like state, a non-local CNOT gate can be
implemented as shown in figure 8.

Theorem 3.1. Given a control line $|c\rangle = \alpha|0\rangle + \beta|1\rangle$ and an entangled pair
$\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$ created by an entangling gate, a non-local CNOT gate can be
implemented from the cat-entangler and cat-disentangler, as shown in figure 8.

Proof: After applying the cat-entangler, the state at point B is

(3.1) $(\alpha|0r0\rangle + \beta|1r1\rangle)|t\rangle$,

where r denotes the result of the measurement.

Since line 1 and line 3 are in the cat-like state $\alpha|00\rangle + \beta|11\rangle$, we can use either
line 1 or line 3 to control the X gate. In this case, we use line 3, which is on the
same machine as the target line.

Because the control line does not change after applying CNOT, the state of 
line 1 and line 3 remains in a cat-like state. Therefore after applying the cat- 
disentangler, the control qubit is restored to line 1. Hence, we have completed a 
non-local CNOT operation. 
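
The argument of this proof can be traced numerically. The following sketch is an illustration under assumptions introduced here: qubits 0 to 3 stand for lines 1 to 4 of figure 8, the target starts in a basis state t, the helper names are hypothetical, and the outcomes r and s (the two classical bits exchanged between the machines) are post-selected.

```python
from itertools import product
from math import isclose, sqrt

def cnot(state, c, t):
    """Apply CNOT (control c, target t) to a {bits: amplitude} state."""
    return {b[:t] + (b[t] ^ b[c],) + b[t + 1:]: a for b, a in state.items()}

def flip(state, q):
    """Apply an X gate to qubit q."""
    return {b[:q] + (1 - b[q],) + b[q + 1:]: a for b, a in state.items()}

def hadamard(state, q):
    """Apply a Hadamard gate to qubit q."""
    out = {}
    for bits, amp in state.items():
        for v in (0, 1):
            new = bits[:q] + (v,) + bits[q + 1:]
            sign = -1 if bits[q] == 1 and v == 1 else 1
            out[new] = out.get(new, 0) + sign * amp / sqrt(2)
    return out

def project(state, q, r):
    """Post-select outcome r on qubit q and renormalize the state."""
    kept = {b: a for b, a in state.items() if b[q] == r}
    norm = sqrt(sum(abs(a) ** 2 for a in kept.values()))
    return {b: a / norm for b, a in kept.items()}

def phase_flip(state, q):
    """Apply a Z gate to qubit q."""
    return {b: (-a if b[q] else a) for b, a in state.items()}

def nonlocal_cnot(alpha, beta, t, r, s):
    """Qubit 0: control, qubits 1 and 2: channel pair, qubit 3: target."""
    state = {(0, 0, 0, t): alpha, (1, 0, 0, t): beta}
    state = cnot(hadamard(state, 1), 1, 2)      # E gate: the shared ebit
    state = project(cnot(state, 0, 1), 1, r)    # cat-entangler, outcome r
    if r:
        state = flip(state, 2)                  # first cbit
    state = cnot(state, 2, 3)                   # local CNOT controlled by line 3
    state = project(hadamard(state, 2), 2, s)   # cat-disentangler, outcome s
    if s:
        state = phase_flip(state, 0)            # second cbit
    return state

for t, r, s in product((0, 1), repeat=3):
    print(t, r, s, nonlocal_cnot(0.6, 0.8, t, r, s))
```

In every branch the control and target end in $\alpha|0\rangle|t\rangle + \beta|1\rangle X|t\rangle$, matching equation (2.1), at the cost of one entangled pair and two classical bits.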

Figure 8. After applying the cat-entangler, the state of the control
line and the third line becomes the cat-like state $\alpha|00\rangle + \beta|11\rangle$.
Instead of using the control line (the first line) directly, line 3 can
be used to control line 4. Then we apply a cat-disentangler to restore
the control qubit. Each computer boundary is indicated by
a tightly dotted box.



This circuit is an optimal implementation: as noted in the introduction, Eisert et al. prove that one
ebit and two cbits are necessary and sufficient for implementing a non-local CNOT
gate. Furthermore, the cat-entangler and cat-disentangler can be applied to create
a non-local $\Lambda_1(U)$ gate. In many cases, this gives an efficient implementation of $\Lambda_1(U)$.

3.2. Constructing Small Non-local Control Gates. In this section,
we assume that a unitary transformation U is small enough to implement on one
computer, but that the control line is on a different computer. Moreover, the
transformation U is composed of a number of basic gates, i.e., $U = U_1 \cdot U_2 \cdots U_k$,
for some integer k.

Once a cat-entangler creates a cat-like state, distributed control is established
between the two computers. Since the control line can be reused, only one cat-entangler
is needed. Therefore, one ebit and two cbits are needed to implement this type of
non-local $\Lambda_1(U)$ gate. This idea is demonstrated in figure 9, where $U = U_1 \cdot U_2 \cdot \mathrm{CNOT}$.

Because the control line is reused, the cost of implementing the $\Lambda_1(U_i)$ gates
can be shared among the basic gates. In other words, each non-local $\Lambda_1(U_i)$ can be
implemented using only $\frac{1}{k}$ ebits and $\frac{2}{k}$ cbits, asymptotically.
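
The cost sharing can be made concrete with a small calculation. The sketch below is only bookkeeping under the assumption stated in this section (one ebit and two cbits per distributed control line); the function names `amortized_cost` and `naive_cost` are introduced here for illustration.

```python
from fractions import Fraction

EBITS_PER_CONTROL = 1  # one shared entangled pair per distributed control line
CBITS_PER_CONTROL = 2  # one classical bit in each direction

def amortized_cost(k):
    """Per-gate (ebits, cbits) when one distributed control drives k basic gates."""
    return Fraction(EBITS_PER_CONTROL, k), Fraction(CBITS_PER_CONTROL, k)

def naive_cost(k):
    """Total (ebits, cbits) when each of the k gates distributes its own control."""
    return k * EBITS_PER_CONTROL, k * CBITS_PER_CONTROL

for k in (1, 3, 10):
    print(k, amortized_cost(k), naive_cost(k))
```

For the example of figure 9, with k = 3 basic gates, reusing the control line costs one ebit and two cbits in total, instead of three ebits and six cbits.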

3.3. Constructing Large Non-local Control-U Gates. In this section, 
we assume that a unitary transformation U is too big to implement on one com- 
puter. Therefore, U has to be decomposed into a number of smaller transformations, 
where each transformation can be implemented on one computer. In this setting, 
there are two scenarios to consider. 

In the first scenario, we assume that there are no shared qubits between the
components of U, i.e. U is decomposed into smaller transformations, where each
transformation is applied to different qubits. For example, assume U is a unitary
transformation for a 7-qubit system, and $U = U_1 \cdot U_2 \cdot U_3$, where $U_1$ is a unitary
transformation acting on qubits 1 and 2, $U_2$ is a unitary transformation acting on
qubits 3, 4, and 5, and $U_3$ is a unitary transformation acting on qubits 6 and 7. In
this case, not only can the cat-like state be used to distribute a control line among
three computers, but $\Lambda_1(U)$ can also be executed in parallel, as demonstrated in
figure 10.

Figure 9. Assume $U = U_1 \cdot U_2 \cdot \mathrm{CNOT}$. Then $\Lambda_1(U) = \Lambda_1(U_1) \cdot \Lambda_1(U_2) \cdot \Lambda_1(\mathrm{CNOT})$
can be distributed as shown. The control line
needs to be distributed only once, because it can be reused. This
implementation allows the cost of distributing the control qubit to
be shared among the elementary gates.

Figure 10. In this case, the transformation U can be decomposed
into a set of small transformations. The cat-like state allows
the control to be distributed among multiple computers, and also
allows each computer to execute in parallel.

In the second scenario, there are shared qubits between these transformations.
For example, assume U is a transformation on a 3-qubit system,
and $U = U_1 \cdot U_2 \cdot U_3 \cdot U_4$, where $U_1$ and $U_4$ act on qubits 1 and 2, and $U_2$ and $U_3$ act on
qubits 2 and 3.



Figure 11. U can be decomposed into a set of smaller transformations
that can be implemented on one computer. However, they
share qubit number 2. A distributed version of this case can be
implemented by transporting or teleporting qubit number 2 back and
forth between the top and the bottom machine. Line 2 becomes
line 2' in the bottom computer.



In this case, the shared qubits can be physically transported or teleported
from one computer to another. However, sending a shared qubit introduces a
communication overhead, which is not present in a non-distributed implementation
of $\Lambda_1(U)$.

3.4. Constructing a $\Lambda_m(U)$ Gate. A non-local $\Lambda_m(U)$ can simply be implemented
by using m cat-like pairs to distribute m control qubits to one machine,
and then implementing $\Lambda_m(U)$ locally, as suggested in [J. Figure 12 shows an
instance of $\Lambda_2(U)$.

However, this requires the computer to have enough qubits to implement $\Lambda_m(U)$
locally. Therefore, the number of non-local control lines is limited by the number
of qubits available on one computer.

Barenco et al show that the $\Lambda_{m-2}(X)$ gate can be linearly simulated on an
m-qubit system [5]. For example, for any $m \geq 5$ and $2 \leq n \leq m - 3$, a $\Lambda_{m-2}(X)$ gate
can be implemented as two $\Lambda_n(X)$ gates and two $\Lambda_{m-n-1}(X)$ gates on an m-qubit
system. An implementation for the case of m = 6 and n = 2 is shown in figure 13.
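
Since these gates only permute computational basis states, the m = 6, n = 2 case can be verified classically by checking all 64 basis states. The sketch below is a check under assumptions made here: lines are 0-indexed (line 5 of the text is index 4, line 6 is index 5), the gate order (the $\Lambda_2$ gate first, alternating with the $\Lambda_3$ gate) is assumed rather than read off figure 13, and the function names are hypothetical.

```python
from itertools import product

def controlled_not(bits, controls, target):
    """Flip bits[target] when every control bit is 1 (action on a basis state)."""
    bits = list(bits)
    if all(bits[c] for c in controls):
        bits[target] ^= 1
    return tuple(bits)

def decomposed(bits):
    """Two Lambda_2(X) gates onto index 4 interleaved with two Lambda_3(X) gates."""
    for controls, target in (((0, 1), 4), ((2, 3, 4), 5)) * 2:
        bits = controlled_not(bits, controls, target)
    return bits

def direct(bits):
    """The Lambda_4(X) gate: indices 0..3 control index 5."""
    return controlled_not(bits, (0, 1, 2, 3), 5)

all_match = all(decomposed(b) == direct(b) for b in product((0, 1), repeat=6))
print(all_match)
```

The work line (index 4) may start in any state and is returned to it, which is why the four smaller gates reproduce $\Lambda_4(X)$ exactly.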

This technique can be used to break down a large $\Lambda_m(U)$ gate into a sequence
of smaller gates. After thus making the number of control lines small enough, we
can distribute these control lines to one machine, and implement each gate locally.

A distributed implementation of figure 13 is demonstrated in figure 14. We
group lines 1, 2, and 5 together on the top machine. Then we distribute line 5 to
line 5' and then use it to perform a control operation with lines 3 and 4 applied to
line 6 in the bottom machine.

4. Establishing and Refreshing an Entangled Pair 

Before any non-local CNOT or teleportation operation is performed, an
entangled pair must be established between the computers' channel qubits. Moreover, after the
operation is finished, the channel qubits must be reset to refresh the pair. We discuss
the establishing and refreshing procedure in the next section via the construction of
entangling gates.

Figure 12. Both control lines $|c_1\rangle$ and $|c_2\rangle$ are distributed via
two cat-like states to the bottom machine so that $\Lambda_2(U)$ can be
implemented locally. The swap gate moves the control line from
the channel qubit and resets it to $|0\rangle$, so that the channel qubit
can be reused.

Figure 13. In this case, m = 6 and n = 2. The $\Lambda_4(X)$ gate is
simulated by two $\Lambda_2$ gates and two $\Lambda_3$ gates as shown.

Figure 14. This circuit presents a distributed version of $\Lambda_4(X)$,
where m = 6 and n = 2. Line 5 becomes line 5' on the bottom
machine.

Figure 15. Assume each machine holds a pair of channel
qubits. Each machine entangles its channel pair, and then
exchanges one qubit of the pair with another machine.

4.1. Constructing the $E_2$ Gate. Because entanglement cannot be
established by local operations alone, the channel qubits to be entangled must first be on one
machine. The simplest way to establish an entangled pair is as follows:

To establish an ebit between a remote computer and a local computer, the remote
computer sends a channel qubit to the local machine. Then, the local machine entangles
the remote channel qubit with one of the local channel qubits. Then one qubit of
the pair is sent back to the remote machine. This process establishes one ebit by
sending two qubits, one in each direction.

If we allow each machine to have two channel qubits, then two ebits can be
established by sending two qubits. To do so, each computer entangles its own
channel qubits, then exchanges one qubit of the pair with the other computer, as
demonstrated in figure 15. As a result, one ebit is established at the cost of
sending one qubit, asymptotically. To refresh the entanglement, the procedure is
simply repeated.

Channel qubits need to be in the state $|0\rangle$ before being entangled, so they
must be reset to $|0\rangle$ before the entangled pair can be
refreshed. Because the measurement result reveals the current state of the measured
qubit, we can use the measurement result to control an X-gate that sets the channel
qubit to $|0\rangle$, as shown in figure 16.

4.2. Constructing the $E_m$ Gate. A non-distributed version of the $E_m$ gate
can be linearly implemented using a Hadamard gate and a sequence of CNOT
gates, as shown in figure 2 (for the three-qubit case). However, it takes $O(m)$ steps to
create an m-fold GHZ state. An efficient way of implementing a non-distributed
version of an $E_m$ gate is shown in figure 17. This implementation utilizes a binary
tree idea to reduce the number of steps from $O(m)$ to $O(\log m)$.
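
The binary tree idea can be sketched as a schedule of CNOT layers in which every qubit already in the cat state acts as the control for one new qubit. This is one possible layout, assumed here for illustration and not necessarily the exact circuit of figure 17; the function names `tree_schedule` and `spread` are hypothetical.

```python
def tree_schedule(m):
    """Return layers of (control, target) CNOT pairs that grow qubit 0 to m qubits."""
    layers, have = [], 1
    while have < m:
        new = min(have, m - have)          # each entangled qubit recruits one more
        layers.append([(c, have + c) for c in range(new)])
        have += new
    return layers

def spread(m, layers):
    """Track the |10...0> branch (after the Hadamard on qubit 0) through the CNOTs."""
    bits = [1] + [0] * (m - 1)
    for layer in layers:
        for c, t in layer:
            bits[t] ^= bits[c]
    return bits

layers = tree_schedule(8)
print(len(layers), sum(len(layer) for layer in layers))
print(spread(8, layers))
```

For m = 8 the schedule has 3 layers and 7 CNOT gates in total. The $|00\ldots0\rangle$ branch is unchanged by the CNOT gates and the other branch becomes $|11\ldots1\rangle$, so the output is the 8-fold cat state.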

A distributed version of the $E_m$ gate can be implemented by using non-local CNOT
gates. In this process, we trade $m - 1$ ebits for an m-fold cat state by replacing
CNOT gates with non-local CNOT gates.

In the linear implementation of Em, each computer requires at least two channel 
qubits to establish entanglement with its neighbors. This number is fixed regardless 
of the number of computers involved. 










Figure 16. This figure demonstrates how the channel qubits can
be reset to $|0\rangle$ by considering the result of the measurement.
Assume the input state at point A is $\alpha|0000\rangle + \beta|1111\rangle$. The output
of the circuit is the state $(\alpha|0\rangle + \beta|1\rangle)|000\rangle$.




Figure 17. This figure represents an efficient implementation of
$E_8$, using a Hadamard gate and 7 CNOT gates. A generalized
GHZ state can be created after $O(\log m)$ steps. However,
this implementation requires more channel qubits.



In contrast, in the binary implementation, the number of channel qubits
on each computer increases with respect to m. In fact, the number of channel qubits
required is $O(\log m)$. This is true because the first computer has to simultaneously
establish $O(\log m)$ entangled pairs with other computers, as shown in figure 17,
before an m-fold cat state is created. However, the trade-off is justified because the
number of steps required to establish the m-fold cat state is reduced from $O(m)$ to
$O(\log m)$.

4.3. Re-constructing a Teleportation Circuit. In the implementation 
of a non-local CNOT gate, measurements are performed on both channel qubits. 
Therefore, the channel qubits can be reset by considering the measurement results. 
In the teleportation circuit, the remote channel qubit is not measured. Moreover, 
it holds the unknown state which is teleported by the circuit. Pati and Braunstein 
show in inj that an unknown qubit can not be deleted. To solve this problem, we 
can swap the remote channel qubit with a qubit in state $|0\rangle$ held in the register on
the same machine. By swapping, the unknown qubit is preserved and the channel
qubit is reset. The construction of the teleportation circuit, with channel qubits
reset, is shown in figure 18.




Figure 18. The remote channel qubit can be reset by swapping
with an empty qubit ($|0\rangle$). In addition, after teleportation, the
original qubit is reset to the $|0\rangle$ state.



One remaining question is how many $|0\rangle$'s (or empty qubits) are needed to
support the teleportation protocol. We observe that, after teleportation, the
original unknown qubit on the source machine is reset to state $|0\rangle$. Therefore, we can
use this qubit as an empty qubit in the next inbound teleportation. However, the
required number of empty qubits still depends on how many qubits are teleported
into the machine before another qubit is teleported out. In other words, we can
consider the $|0\rangle$ qubit as an empty space. To teleport a qubit in, we need an empty
space to receive the qubit. The more qubits teleported into the machine, the more
spaces needed.

4.4. Constructing a Distributed Swap Gate. A distributed swap gate 
can be implemented by using two teleportations to exchange two qubits. If each 
machine has two channel qubits involved in a swapping operation, we do not need 
any empty qubits in the registers. The channel qubit becomes a swapping buffer, 
because the first teleportation creates an empty qubit for the second teleportation 
to use. However, if we have only one channel qubit in a swapping operation, one 
empty qubit is needed for a swapping buffer. 

In 13], Collins, Linden, and Popescu show that an optimal non-local swap gate 
requires two ebits and two cbits in each direction. Therefore, implementing a swap 
gate by two teleportations is an optimal implementation. 

5. Distributed Fourier Transform 

The quantum Fourier transform is a unitary transformation defined on basis 
states as follows, 

(5.1) $|j\rangle \rightarrow \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n - 1} e^{2\pi i jk/2^n}|k\rangle$,

where n is the number of qubits. See [2, 10] for more details.

Shor presents the quantum Fourier transformation as a routine used in the
factoring algorithm. It has since become a standard routine used in many quantum
algorithms, such as phase estimation and the hidden subgroup algorithms. An
efficient circuit for the quantum Fourier transformation, found in [10], is shown in
figure 19. The gate $R_k$ is defined as follows,

(5.2) $R_k = \begin{pmatrix} 1 & 0 \\ 0 & e^{2\pi i/2^k} \end{pmatrix}$,

where $k \in \{2, 3, \ldots\}$.

Figure 19. The Fourier transformation is a sequence of
Hadamard and controlled-$R_k$ gates. At the end, a swapping gate
is used to reset the order of the qubits.

Figure 20. By applying non-local controlled-$R_k$ gates, we construct
a distributed Fourier transformation. This figure shows the
distributed Fourier transformation for 4 qubits which is implemented
on two machines. The swap gate can be implemented by teleporting
qubits back and forth between two computers.

We use the construction of $\Lambda_1(U)$ to implement a distributed version of the
Fourier transformation. The swap gate can be implemented by teleporting qubits
back and forth between two computers.

The communication resources needed depend on how many non-local control
gates have to be implemented. Assume that we implement the Fourier transformation
on m computers, and each computer has k qubits, not including the channel
qubits. Let $n = mk$. Then there are $(n - 1)n/2$ control gates to implement.
Each computer has $(k - 1)k/2$ local control gates, so the total number of
local control gates is $m(k - 1)k/2 = (n/m - 1)n/2$. The number of non-local control
gates is therefore $(n - 1)n/2 - (n/m - 1)n/2 = (1 - 1/m)n^2/2$. Hence, the communication resources are
$O(n^2)$.
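
The gate counts above can be reproduced by direct enumeration. The sketch below assumes, for illustration only, that the n = mk qubits are assigned to computers in contiguous blocks of k and that there is one control gate per pair of qubits; the function name `qft_gate_counts` is hypothetical.

```python
def qft_gate_counts(m, k):
    """Return (total, local, non_local) control-gate counts for m computers of k qubits."""
    n = m * k
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # A control gate is local when both of its qubits sit on the same computer.
    local = sum(1 for i, j in pairs if i // k == j // k)
    return len(pairs), local, len(pairs) - local

for m, k in ((2, 2), (4, 4), (8, 2)):
    print(m, k, qft_gate_counts(m, k))
```

For the 4-qubit, two-machine case of figure 20 this gives 6 control gates in total, of which 2 are local and 4 are non-local.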

An implementation of the distributed Fourier transformation of 4 qubits is
shown in figure 20.

6. Conclusion

The teleportation circuit and the non-local CNOT gate can be implemented 
using two primitive operations, cat-entangler and cat-disentangler. These two prim- 
itive operations work on the idea that a cat-like state can be used to distribute a 
control line. This principle is extended to construct non-local operations efficiently
in many cases. 

For example, the communication cost can be shared among elementary gates 
because a control line can be reused after being distributed via a cat-like state.
Moreover, the cat-like state allows parallel implementation of some control gates.

Before using a non-local operation, entanglement has to be established.
Moreover, it has to be refreshed after being used. Fortunately, by looking at the
measurement results, we can reset channel qubits, and re-establish entanglement.
This procedure works well with non-local CNOT gates. To reset channel qubits
in the teleportation operation, the channel qubits have to be swapped with empty
qubits ($|0\rangle$).

In general, if we can break down a unitary transformation into a sequence of
CNOT gates and one-qubit gates, a distributed version of the unitary transformation
can be simply implemented by replacing CNOT gates with non-local CNOT
gates. Therefore, the communication overhead depends on the number of
non-local CNOT gates that must be implemented. With help from teleportation to
move qubits back and forth between computers, better distributed implementations
can be achieved.